Skip to content

feat: upgrade otel dependencies with backward compatible semconv attributes#2714

Open
Noroth wants to merge 39 commits intomainfrom
ludwig/eng-9268-router-upgrade-otel-dependencies-and-implement-semconv
Open

feat: upgrade otel dependencies with backward compatible semconv attributes#2714
Noroth wants to merge 39 commits intomainfrom
ludwig/eng-9268-router-upgrade-otel-dependencies-and-implement-semconv

Conversation

@Noroth
Copy link
Copy Markdown
Contributor

@Noroth Noroth commented Mar 31, 2026

This PR is the first step of our OTEL upgrade. It ensures that the dependencies are upgraded to the latest version while we are still able to drop/rewrite any newly introduced/altered attributes. This is necessary because various telemetry systems currently rely on the attributes emitted by the router.

In the future we will need to slowly migrate to the new attributes

Summary by CodeRabbit

  • Chores

    • Updated Go toolchain and bumped many dependencies and build tools (telemetry, metrics, gRPC, protobufs, Prometheus, JWT).
  • Improvements

    • Reduced telemetry noise by dropping HTTP-instrumentation metrics and filtering specified span attributes.
    • Remapped semantic-convention attribute keys for downstream compatibility.
    • Added HTTP request/response content-length attributes and adjusted OTLP exporter defaults for traces/metrics.
  • Tests

    • Added telemetry test helpers and expanded/robustified telemetry and metric tests.

Checklist

Open Source AI Manifesto

This project follows the principles of the Open Source AI Manifesto. Please ensure your contribution aligns with its principles.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 31, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Updated Go toolchain and many module versions; introduced attribute-filtering tracer provider and semconv remapping; dropped otelhttp-scope metrics; set HTTP content-length attributes on client spans; and adjusted telemetry tests and metric cardinality handling.

Changes

Cohort / File(s) Summary
Go module files
demo/go.mod, demo/pkg/subgraphs/projects/go.mod, graphqlmetrics/go.mod, router-plugin/go.mod, router-tests/go.mod, router/go.mod
Bumped go directive to 1.25.0 and upgraded many dependencies (OpenTelemetry, Prometheus, gRPC/protobuf, golang.org/x/*, etc.); replaced github.com/cenkalti/backoff/v4v5; removed/adjusted prior OpenTelemetry replace pins.
Router tracing core
router/pkg/trace/filtering_provider.go, router/pkg/trace/semconv_processor.go, router/pkg/trace/meter.go, router/pkg/trace/handler.go
Added FilteringTracerProvider to strip configured semantic-convention attributes at Start/SetAttributes; added semconvProcessor SpanProcessor to remap newer semconv keys to legacy names on span end; wired these into tracer construction and handler middleware.
HTTP transport & metrics filtering
router/core/transport.go, router/pkg/trace/transport.go, router/pkg/metric/meter.go
Wrapped otelhttp tracer provider with the filtering provider; client RoundTrip now sets HTTP request/response content-length attributes on spans; metric view logic drops metrics emitted from the otelhttp instrumentation scope for Prometheus and OTLP exporters.
Metric store & tests
router/pkg/metric/metric_store.go, router/pkg/metric/config.go, router/pkg/metric/metric_store_test.go
Moved cardinality handling to SDK option metric.WithCardinalityLimit, normalized default/negative limits, removed env-based limit tests; added explicit OTLP/Prometheus cardinality tests, helpers, and concurrency adjustments; renamed oltpMetricsotlpMetrics.
Exporter naming & defaults
router/core/router.go, router-plugin/config/settings.go, router-plugin/tracing/tracer.go, router/pkg/otel/otelconfig/otelconfig.go, router/pkg/trace/meter.go, router/pkg/trace/config.go
Renamed exported constants from ExporterOLTP*ExporterOTLP* and updated default exporter selections and switch cases to use the new names (defaulting to OTLP HTTP).
Telemetry tests & helpers
router-tests/telemetry/telemetry_helpers_test.go, router-tests/telemetry/telemetry_test.go, router/pkg/trace/middleware_test.go, router/pkg/trace/tracer_test.go, router/pkg/trace/transport_test.go
Added assertHasAttributes and debug printer; tests now ignore exemplars, assert span attributes via sets, select datapoints by attribute (order-independent), and use semconvProcessor/FilteringTracerProvider in test tracer providers; updated some expectations.
Prometheus translation & metrics
graphqlmetrics/pkg/telemetry/prometheus_metrics.go
Replaced otelprom.WithoutUnits() with otelprom.WithTranslationStrategy(otlptranslator.UnderscoreEscapingWithSuffixes) to change metric name/unit translation.
Build tooling
Makefile
Updated setup-build-tools to install newer protoc-gen-go (v1.36.11) and protoc-gen-connect-go (v1.19.1).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 51.61% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: upgrading OpenTelemetry dependencies while maintaining backward-compatible semantic convention attributes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 31, 2026

Dependency Review

The following issues were found:

  • ❌ 2 vulnerable package(s)

View full job summary

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 31, 2026

Router-nonroot image scan passed

✅ No security vulnerabilities found in image:

ghcr.io/wundergraph/cosmo/router:sha-0fb550abbef78d13112aa17fe363eef8f38c2b70-nonroot

@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 47.57709% with 119 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.68%. Comparing base (b39e30e) to head (d60cb40).
⚠️ Report is 47 commits behind head on main.

Files with missing lines Patch % Lines
...to/wg/cosmo/graphqlmetrics/v1/graphqlmetrics.pb.go 16.66% 44 Missing and 1 partial ⚠️
...to/wg/cosmo/graphqlmetrics/v1/graphqlmetrics.pb.go 16.66% 36 Missing and 9 partials ⚠️
router/internal/retrytransport/retry_transport.go 56.25% 4 Missing and 3 partials ⚠️
router/pkg/trace/filtering_provider.go 80.64% 3 Missing and 3 partials ⚠️
...oto/wg/cosmo/node/v1/nodev1connect/node.connect.go 0.00% 4 Missing ⚠️
.../graphqlmetricsv1connect/graphqlmetrics.connect.go 50.00% 3 Missing ⚠️
router/pkg/trace/meter.go 25.00% 3 Missing ⚠️
router/core/router.go 0.00% 2 Missing ⚠️
...phqlmetrics/gen/proto/wg/cosmo/common/common.pb.go 50.00% 1 Missing ⚠️
router/gen/proto/wg/cosmo/common/common.pb.go 50.00% 1 Missing ⚠️
... and 2 more

❌ Your patch check has failed because the patch coverage (47.57%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #2714       +/-   ##
===========================================
+ Coverage   41.55%   64.68%   +23.13%     
===========================================
  Files         785      270      -515     
  Lines      112455    27424    -85031     
  Branches     8667        0     -8667     
===========================================
- Hits        46730    17740    -28990     
+ Misses      65362     8262    -57100     
- Partials      363     1422     +1059     
Files with missing lines Coverage Δ
.../graphqlmetricsv1connect/graphqlmetrics.connect.go 78.26% <100.00%> (ø)
graphqlmetrics/pkg/telemetry/prometheus_metrics.go 81.25% <100.00%> (ø)
router/core/transport.go 91.25% <100.00%> (ø)
router/gen/proto/wg/cosmo/node/v1/node.pb.go 28.89% <ø> (ø)
router/pkg/metric/config.go 67.50% <100.00%> (ø)
router/pkg/metric/metric_store.go 78.42% <100.00%> (ø)
router/pkg/otel/otelconfig/otelconfig.go 28.57% <ø> (ø)
router/pkg/trace/config.go 75.00% <100.00%> (ø)
router/pkg/trace/handler.go 85.71% <100.00%> (ø)
router/pkg/trace/transport.go 100.00% <100.00%> (ø)
... and 12 more

... and 1033 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (7)
router-tests/telemetry/telemetry_helpers_test.go (2)

14-19: Consider simplifying: HasValue check is redundant.

The attributes.Value(key) call on line 16-17 already returns ok = false when the key doesn't exist, making the HasValue check on line 15 redundant.

♻️ Suggested simplification
 func assertHasAttributes(t *testing.T, attributes attribute.Set, expectedAttributes ...attribute.KeyValue) {
 	t.Helper()
 
 	for _, expectedAttribute := range expectedAttributes {
-		assert.True(t, attributes.HasValue(expectedAttribute.Key))
 		value, ok := attributes.Value(expectedAttribute.Key)
-		assert.True(t, ok)
+		assert.True(t, ok, "expected attribute %s to exist", expectedAttribute.Key)
 		assert.Equal(t, expectedAttribute.Value, value)
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router-tests/telemetry/telemetry_helpers_test.go` around lines 14 - 19, The
HasValue check is redundant because attributes.Value(expectedAttribute.Key)
already returns ok=false for missing keys; remove the call to
attributes.HasValue and its assertion, and instead call value, ok :=
attributes.Value(expectedAttribute.Key) directly, assert.True(t, ok) to ensure
the key exists, then assert.Equal(t, expectedAttribute.Value, value); update the
loop in the test (referring to attributes, expectedAttribute, HasValue, and
Value) accordingly to simplify the assertions.

22-26: Debug helper: AsString() returns empty string for non-string attribute types.

This is fine for debugging purposes, but be aware that AsString() returns an empty string for non-string attribute types (Int, Float, Bool, etc.). If you need to debug non-string attributes, consider using attribute.Value.Emit() or type-specific getters.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router-tests/telemetry/telemetry_helpers_test.go` around lines 22 - 26, The
debug helper printAttributeNames uses attribute.Value.AsString(), which yields
an empty string for non-string types; update printAttributeNames to handle
non-string attributes by using attribute.Value.Emit() or the type-specific
getters (e.g., Int, Bool, Double) or a type switch on attribute.Value.Type to
produce meaningful output for numeric/boolean attributes so debugging shows
actual values instead of blanks.
router/go.mod (1)

35-36: Stale comment contradicts the actual change.

The comment says "Do not upgrade, it renames attributes we rely on" but the line immediately below shows otelhttp v0.67.0 which is the upgraded version. This comment appears to be leftover from before this PR and is now misleading.

Consider removing or updating this comment to reflect the new strategy (using FilteringTracerProvider to maintain backward compatibility):

-	// Do not upgrade, it renames attributes we rely on
+	// Upgraded with backward-compatible attribute filtering via FilteringTracerProvider
 	go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.67.0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/go.mod` around lines 35 - 36, The inline comment stating "Do not
upgrade, it renames attributes we rely on" is stale and contradicts the actual
module line "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp
v0.67.0"; update or remove that comment so it reflects the new approach using
FilteringTracerProvider (or simply delete the warning) next to the "otelhttp
v0.67.0" entry to avoid confusion for future readers.
router-tests/telemetry/telemetry_test.go (3)

11146-11148: Don’t assert absence on only DataPoints[0].

These checks only inspect the first histogram datapoint. If multiple datapoints exist, the test can miss a violating datapoint carrying the forbidden key.

💡 Suggested fix pattern
- atts := subgraphNonMetric.Data.(metricdata.Histogram[float64]).DataPoints[0].Attributes
- _, ok := atts.Value(attribute.Key(key))
- require.False(t, ok)
+ for _, dp := range subgraphNonMetric.Data.(metricdata.Histogram[float64]).DataPoints {
+   _, ok := dp.Attributes.Value(attribute.Key(key))
+   require.False(t, ok)
+ }

Also applies to: 11214-11216, 11296-11298

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router-tests/telemetry/telemetry_test.go` around lines 11146 - 11148, The
test currently only inspects the first histogram datapoint
(subgraphNonMetric.Data.(metricdata.Histogram[float64]).DataPoints[0]) when
asserting the forbidden attribute is absent; update the check to iterate over
all datapoints in the DataPoints slice and assert that for each datapoint's
Attributes the attribute.Key(key) is not present (use the same attribute lookup
pattern and require.False/assertion per datapoint) so no datapoint can carry the
forbidden key; apply the same change where similar single-index checks appear
(the other instances noted).

4435-4441: Clean up commented-out assertions.

These commented require.Contains lines are obsolete after assertHasAttributes and should be removed to keep the test readable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router-tests/telemetry/telemetry_test.go` around lines 4435 - 4441, Remove
the obsolete commented assertions (the lines starting with "//
require.Contains(t, sn[0].Attributes(), ...") since they duplicate checks now
covered by assertHasAttributes; locate them near the telemetry test where
assertHasAttributes is used and delete those commented require.Contains lines so
the test file is cleaner and only relies on assertHasAttributes for the
attribute checks.

7965-7965: Remove leftover debug output from test path.

Line 7965 prints attribute names unconditionally, which adds noise to CI logs and makes failures harder to triage.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router-tests/telemetry/telemetry_test.go` at line 7965, Remove the leftover
debug print by deleting the call to printAttributeNames(sn[2].Attributes()) in
the test; if you need the info for debugging keep it behind the test logger
(e.g., use t.Log/T.Logf) or a conditional debug flag, but do not leave an
unconditional printAttributeNames invocation that writes to stdout in the test
path.
router/pkg/trace/filtering_provider.go (1)

35-37: Prefer OTEL's embedded interfaces in these wrappers.

FilteringTracerProvider, filteringTracer, and filteringSpan embed oteltrace.TracerProvider, oteltrace.Tracer, and oteltrace.Span directly. The OTEL trace docs explicitly warn that these interfaces may gain methods in minor releases and recommend embedding go.opentelemetry.io/otel/trace/embedded (or trace/noop) instead, so upgrades fail predictably instead of turning into runtime panics. (pkg.go.dev)

Also applies to: 43-45, 84-86

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/pkg/trace/filtering_provider.go` around lines 35 - 37, Replace direct
embedding of OTEL interfaces with OTEL's embedded variants to avoid breakage
when methods are added: change the structs FilteringTracerProvider,
filteringTracer, and filteringSpan to embed
go.opentelemetry.io/otel/trace/embedded versions (e.g., embedded.TracerProvider,
embedded.Tracer, embedded.Span or the trace/noop equivalents) instead of
oteltrace.TracerProvider, oteltrace.Tracer, and oteltrace.Span; update imports
to reference the embedded package and ensure method forwarding remains correct
so the wrappers still satisfy the same interfaces.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@router-plugin/go.mod`:
- Line 17: Update the grpc dependency entry for google.golang.org/grpc in go.mod
from v1.79.2 to v1.79.3 to pull in the security fix for GHSA-p77j-4mvh-x3m3;
after changing the version string, run dependency resolution (e.g., go get
google.golang.org/grpc@v1.79.3 and go mod tidy) and verify builds/tests that
reference the gRPC client/server symbols to ensure no regressions.

In `@router-tests/telemetry/telemetry_test.go`:
- Line 5524: The test is indexing rm.ScopeMetrics by position after already
resolving the scope by name, which reintroduces order coupling; update the
assertion to use the previously resolved scopeMetric (e.g.,
scopeMetric.Metrics[6]) or locate the specific metric within scopeMetric.Metrics
by name before calling metricdatatest.AssertEqual, keeping the assertion target
consistent with the earlier lookup (use scopeMetric instead of
rm.ScopeMetrics[0] when asserting routerInfoMetric with
metricdatatest.AssertEqual).

In `@router/pkg/trace/filtering_provider.go`:
- Around line 88-96: filteringSpan.SetAttributes currently compacts the variadic
kv slice in place which mutates caller-owned backing storage; instead, allocate
a new slice (e.g., make([]attribute.KeyValue, 0, len(kv))) or preallocate
filtered of length n and append acceptable entries from kv while checking
semconvDropKeys, and then call s.Span.SetAttributes(filtered...) so the original
kv is not modified; update the function to use this new filtered slice and
remove in-place writes to kv.
- Around line 67-77: The current rebuild loop in filtering_provider.go
incorrectly drops all SpanStartEventOption implementations (notably
WithTimestamp) when filtering attributes; update the rebuild logic in the span
start path (the code that mutates opts and uses cfg) to preserve explicit start
options by extracting the preserved values from cfg (timestamp via
ContextWithStartTime/WithTimestamp, span kind, new root flag, links) and then
reconstruct opts: copy all non-attribute SpanStartOption values, re-add
oteltrace.WithAttributes(filtered...), and explicitly re-add the preserved
options (e.g., oteltrace.WithTimestamp(timestamp) if set,
oteltrace.WithSpanKind(kind) if set, oteltrace.WithNewRoot(true) if set, and any
links) so WithTimestamp and other important start options are not lost.

In `@router/pkg/trace/transport.go`:
- Around line 49-52: The code calls span.SetAttributes with
semconv.HTTPRequestContentLength(int(r.ContentLength)) (and similarly for
res.ContentLength) which can truncate int64 to int on 32-bit systems or for very
large payloads; update the handling in the functions that use r.ContentLength
and res.ContentLength (e.g., where otrace.SpanFromContext is used and
span.SetAttributes is invoked) to defensively check the int64 value before
casting: if the value fits within int (use math.MaxInt/int64 comparisons) cast
normally, otherwise clamp to math.MaxInt or use a semantic attribute helper that
accepts int64 if available, and include a short comment explaining the clamp to
avoid silent truncation.

---

Nitpick comments:
In `@router-tests/telemetry/telemetry_helpers_test.go`:
- Around line 14-19: The HasValue check is redundant because
attributes.Value(expectedAttribute.Key) already returns ok=false for missing
keys; remove the call to attributes.HasValue and its assertion, and instead call
value, ok := attributes.Value(expectedAttribute.Key) directly, assert.True(t,
ok) to ensure the key exists, then assert.Equal(t, expectedAttribute.Value,
value); update the loop in the test (referring to attributes, expectedAttribute,
HasValue, and Value) accordingly to simplify the assertions.
- Around line 22-26: The debug helper printAttributeNames uses
attribute.Value.AsString(), which yields an empty string for non-string types;
update printAttributeNames to handle non-string attributes by using
attribute.Value.Emit() or the type-specific getters (e.g., Int, Bool, Double) or
a type switch on attribute.Value.Type to produce meaningful output for
numeric/boolean attributes so debugging shows actual values instead of blanks.

In `@router-tests/telemetry/telemetry_test.go`:
- Around line 11146-11148: The test currently only inspects the first histogram
datapoint (subgraphNonMetric.Data.(metricdata.Histogram[float64]).DataPoints[0])
when asserting the forbidden attribute is absent; update the check to iterate
over all datapoints in the DataPoints slice and assert that for each datapoint's
Attributes the attribute.Key(key) is not present (use the same attribute lookup
pattern and require.False/assertion per datapoint) so no datapoint can carry the
forbidden key; apply the same change where similar single-index checks appear
(the other instances noted).
- Around line 4435-4441: Remove the obsolete commented assertions (the lines
starting with "// require.Contains(t, sn[0].Attributes(), ...") since they
duplicate checks now covered by assertHasAttributes; locate them near the
telemetry test where assertHasAttributes is used and delete those commented
require.Contains lines so the test file is cleaner and only relies on
assertHasAttributes for the attribute checks.
- Line 7965: Remove the leftover debug print by deleting the call to
printAttributeNames(sn[2].Attributes()) in the test; if you need the info for
debugging keep it behind the test logger (e.g., use t.Log/T.Logf) or a
conditional debug flag, but do not leave an unconditional printAttributeNames
invocation that writes to stdout in the test path.

In `@router/go.mod`:
- Around line 35-36: The inline comment stating "Do not upgrade, it renames
attributes we rely on" is stale and contradicts the actual module line
"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.67.0"; update
or remove that comment so it reflects the new approach using
FilteringTracerProvider (or simply delete the warning) next to the "otelhttp
v0.67.0" entry to avoid confusion for future readers.

In `@router/pkg/trace/filtering_provider.go`:
- Around line 35-37: Replace direct embedding of OTEL interfaces with OTEL's
embedded variants to avoid breakage when methods are added: change the structs
FilteringTracerProvider, filteringTracer, and filteringSpan to embed
go.opentelemetry.io/otel/trace/embedded versions (e.g., embedded.TracerProvider,
embedded.Tracer, embedded.Span or the trace/noop equivalents) instead of
oteltrace.TracerProvider, oteltrace.Tracer, and oteltrace.Span; update imports
to reference the embedded package and ensure method forwarding remains correct
so the wrappers still satisfy the same interfaces.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b49a4b2d-9428-4416-a9d5-d32d3f070e59

📥 Commits

Reviewing files that changed from the base of the PR and between dc4388d and d7185d9.

⛔ Files ignored due to path filters (5)
  • demo/go.sum is excluded by !**/*.sum
  • graphqlmetrics/go.sum is excluded by !**/*.sum
  • router-plugin/go.sum is excluded by !**/*.sum
  • router-tests/go.sum is excluded by !**/*.sum
  • router/go.sum is excluded by !**/*.sum
📒 Files selected for processing (14)
  • demo/go.mod
  • graphqlmetrics/go.mod
  • router-plugin/go.mod
  • router-tests/go.mod
  • router-tests/telemetry/telemetry_helpers_test.go
  • router-tests/telemetry/telemetry_test.go
  • router/core/transport.go
  • router/go.mod
  • router/pkg/metric/meter.go
  • router/pkg/trace/filtering_provider.go
  • router/pkg/trace/handler.go
  • router/pkg/trace/meter.go
  • router/pkg/trace/semconv_processor.go
  • router/pkg/trace/transport.go

Comment thread router-plugin/go.mod Outdated
Comment thread router-tests/telemetry/telemetry_test.go
Comment thread router/pkg/trace/filtering_provider.go Outdated
Comment thread router/pkg/trace/filtering_provider.go
Comment on lines +49 to +52
if r.ContentLength > 0 {
span := otrace.SpanFromContext(r.Context())
span.SetAttributes(semconv.HTTPRequestContentLength(int(r.ContentLength)))
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential int64-to-int truncation for very large content lengths.

r.ContentLength and res.ContentLength are int64, but HTTPRequestContentLength and HTTPResponseContentLength accept int. On 32-bit systems (or when content exceeds math.MaxInt), this could truncate. While rare in practice, consider checking bounds or using a larger type if supported.

🛡️ Optional defensive check
+import "math"
+
 // otelhttp v0.67.0 no longer emits http.request_content_length on client spans.
 // Set it here for backward compatibility with downstream systems.
-if r.ContentLength > 0 {
+if r.ContentLength > 0 && r.ContentLength <= int64(math.MaxInt) {
 	span := otrace.SpanFromContext(r.Context())
 	span.SetAttributes(semconv.HTTPRequestContentLength(int(r.ContentLength)))
 }

Also applies to: 64-66

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/pkg/trace/transport.go` around lines 49 - 52, The code calls
span.SetAttributes with semconv.HTTPRequestContentLength(int(r.ContentLength))
(and similarly for res.ContentLength) which can truncate int64 to int on 32-bit
systems or for very large payloads; update the handling in the functions that
use r.ContentLength and res.ContentLength (e.g., where otrace.SpanFromContext is
used and span.SetAttributes is invoked) to defensively check the int64 value
before casting: if the value fits within int (use math.MaxInt/int64 comparisons)
cast normally, otherwise clamp to math.MaxInt or use a semantic attribute helper
that accepts int64 if available, and include a short comment explaining the
clamp to avoid silent truncation.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
router-tests/telemetry/telemetry_test.go (1)

5524-5524: ⚠️ Potential issue | 🟠 Major

Use the resolved scope lookup here instead of indexing rm.ScopeMetrics.

This reintroduces scope-order coupling immediately after scopeMetric was looked up by name, so the test can fail because export order changed rather than because router.info regressed.

💡 Suggested fix
-			metricdatatest.AssertEqual(t, routerInfoMetric, rm.ScopeMetrics[0].Metrics[6], metricdatatest.IgnoreTimestamp(), metricdatatest.IgnoreExemplars())
+			metricdatatest.AssertEqual(t, routerInfoMetric, *testutils.GetMetricByName(&scopeMetric, "router.info"), metricdatatest.IgnoreTimestamp(), metricdatatest.IgnoreExemplars())

As per coding guidelines, "Sort collections by a stable key before making positional assertions on non-deterministic order items (metrics, spans, etc.)"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router-tests/telemetry/telemetry_test.go` at line 5524, The test currently
asserts against rm.ScopeMetrics[0].Metrics[6] which reintroduces ordering
coupling; instead use the previously resolved scope lookup (the scopeMetric or
scopeMetrics variable obtained by name) and assert against its Metrics slice
(e.g., scopeMetric.Metrics[...] or the metrics slice returned by the lookup)
when comparing routerInfoMetric with metricdatatest.AssertEqual; locate where
rm.ScopeMetrics is indexed and replace that positional access with the resolved
scope's Metrics to avoid relying on export order.
🧹 Nitpick comments (3)
router-tests/telemetry/telemetry_test.go (2)

7965-7965: Drop the leftover debug print before merging.

This writes to test output on every run and makes already-parallel telemetry logs harder to read.

✂️ Cleanup
-			printAttributeNames(sn[2].Attributes())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router-tests/telemetry/telemetry_test.go` at line 7965, Remove the leftover
debug print call printAttributeNames(sn[2].Attributes()) from the
telemetry_test.go test; locate the invocation (likely inside the test that
iterates over span nodes or variable sn) and delete that line so the test no
longer writes to stdout during runs and parallel telemetry logs remain clean.

4435-4441: Remove the commented-out assertion block.

assertHasAttributes already replaced these checks, so keeping the old require.Contains calls commented out just adds noise to an already dense test.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router-tests/telemetry/telemetry_test.go` around lines 4435 - 4441, Remove
the dead commented-out assertions block in telemetry_test.go (the lines with
require.Contains referencing sn[0].Attributes() and otel.Wg* constants) because
assertHasAttributes already covers those checks; delete the commented lines so
only the active assertHasAttributes usage remains and the test is no longer
cluttered by legacy commented assertions.
router/pkg/trace/filtering_provider.go (1)

67-85: Consider preserving StackTrace for completeness.

The rebuild logic preserves timestamp, span kind, new root flag, links, and attributes. However, SpanConfig also has a StackTrace() accessor for WithStackTrace(true). If any caller uses that option, it would be silently dropped.

Since otelhttp doesn't typically set stack trace at span start, this is unlikely to cause issues in practice. Flagging for completeness in case future callers use this option.

Optional fix to preserve StackTrace
 		if links := cfg.Links(); len(links) > 0 {
 			rebuilt = append(rebuilt, oteltrace.WithLinks(links...))
 		}
+		if cfg.StackTrace() {
+			rebuilt = append(rebuilt, oteltrace.WithStackTrace(true))
+		}
 		if len(filtered) > 0 {
 			rebuilt = append(rebuilt, oteltrace.WithAttributes(filtered...))
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/pkg/trace/filtering_provider.go` around lines 67 - 85, The rebuild
logic drops the SpanConfig.StackTrace setting so WithStackTrace(true) would be
lost; update the reconstruction in filtering_provider.go to check
cfg.StackTrace() and if true append oteltrace.WithStackTrace() to rebuilt
(before assigning opts = rebuilt), ensuring the StackTrace flag from SpanConfig
is preserved alongside Timestamp, SpanKind, NewRoot, Links, and Attributes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@router-tests/telemetry/telemetry_test.go`:
- Line 5524: The test currently asserts against rm.ScopeMetrics[0].Metrics[6]
which reintroduces ordering coupling; instead use the previously resolved scope
lookup (the scopeMetric or scopeMetrics variable obtained by name) and assert
against its Metrics slice (e.g., scopeMetric.Metrics[...] or the metrics slice
returned by the lookup) when comparing routerInfoMetric with
metricdatatest.AssertEqual; locate where rm.ScopeMetrics is indexed and replace
that positional access with the resolved scope's Metrics to avoid relying on
export order.

---

Nitpick comments:
In `@router-tests/telemetry/telemetry_test.go`:
- Line 7965: Remove the leftover debug print call
printAttributeNames(sn[2].Attributes()) from the telemetry_test.go test; locate
the invocation (likely inside the test that iterates over span nodes or variable
sn) and delete that line so the test no longer writes to stdout during runs and
parallel telemetry logs remain clean.
- Around line 4435-4441: Remove the dead commented-out assertions block in
telemetry_test.go (the lines with require.Contains referencing
sn[0].Attributes() and otel.Wg* constants) because assertHasAttributes already
covers those checks; delete the commented lines so only the active
assertHasAttributes usage remains and the test is no longer cluttered by legacy
commented assertions.

In `@router/pkg/trace/filtering_provider.go`:
- Around line 67-85: The rebuild logic drops the SpanConfig.StackTrace setting
so WithStackTrace(true) would be lost; update the reconstruction in
filtering_provider.go to check cfg.StackTrace() and if true append
oteltrace.WithStackTrace() to rebuilt (before assigning opts = rebuilt),
ensuring the StackTrace flag from SpanConfig is preserved alongside Timestamp,
SpanKind, NewRoot, Links, and Attributes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 33a6d89d-4fb8-491f-b898-d81c80635d85

📥 Commits

Reviewing files that changed from the base of the PR and between d884162 and 26bbb5d.

📒 Files selected for processing (2)
  • router-tests/telemetry/telemetry_test.go
  • router/pkg/trace/filtering_provider.go

Comment thread router/pkg/trace/semconv_processor.go
Comment thread router-tests/security/circuit_breaker_test.go
Comment thread router/pkg/metric/meter.go Outdated
otelhttp.WithMeterProvider(sdkmetric.NewMeterProvider()),
otelhttp.WithSpanNameFormatter(s.SpanNameFormatter),
otelhttp.WithTracerProvider(s.TracerProvider),
otelhttp.WithTracerProvider(&FilteringTracerProvider{TracerProvider: s.TracerProvider}),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have this to be configurable per exporter? When exporting toward external services (like newrelic or datadog) I think it would be useful to allow the standard names, to make it easier to integrate with external services.

@Noroth
Copy link
Copy Markdown
Contributor Author

Noroth commented Apr 2, 2026

The current retry logic has some regressions because the request body is not rewinded automatically. As a workaround on each retry we have to rewind the body manually before calling the underlying transport.

I opened an issue in the otel repository and the maintainers are taking a look currently
open-telemetry/opentelemetry-go-contrib#8759

thisisnithin added a commit that referenced this pull request Apr 13, 2026
Being fixed in #2714 which upgrades opentelemetry-go dependencies.
@github-actions
Copy link
Copy Markdown

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions Bot added Stale and removed Stale labels Apr 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants